Smart Paper Notes

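Although the various published AI ethics principles are converging on a consensus, a gap remains between these high-level principles and practical techniques that can be readily adopted to design and develop responsible AI systems. We examine the practices and experiences of researchers and engineers from Australia's national scientific research agency (CSIRO) who are involved in designing and developing AI systems for a range of purposes. Semi-structured interviews were used to examine how the participants' practices relate to and align with a set of high-level AI ethics principles proposed by the Australian Government. The principles comprise: privacy protection and security; reliability and safety; transparency and explainability; fairness; contestability; accountability; human-centred values; and human, social and environmental wellbeing. We examine the insights of the researchers and engineers, as well as the challenges they encountered in applying the principles in practice. Finally, a set of organisational responses is provided to support putting high-level AI ethics principles into practice.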

Translated by Google Translate

BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric

Mingda Chen, Paul-Ambroise Duquenne, Pierre Andrews, Justine Kao, Alexandre Mourachko, Holger Schwenk, Marta R. Costa-jussà

Category: Natural Language Processing

2022-12-16

End-to-End speech-to-speech translation (S2ST) is generally evaluated with text-based metrics. This means that generated speech has to be automatically transcribed, making the evaluation dependent on the availability and quality of automatic speech recognition (ASR) systems. In this paper, we propose a text-free evaluation metric for end-to-end S2ST, named BLASER, to avoid the dependency on ASR systems. BLASER leverages a multilingual multimodal encoder to directly encode the speech segments for source input, translation output and reference into a shared embedding space and computes a score of the translation quality that can be used as a proxy to human evaluation. To evaluate our approach, we construct training and evaluation sets from more than 40k human annotations covering seven language directions. The best results of BLASER are achieved by training with supervision from human rating scores. We show that when evaluated at the sentence level, BLASER correlates significantly better with human judgment compared to ASR-dependent metrics including ASR-SENTBLEU in all translation directions and ASR-COMET in five of them. Our analysis shows combining speech and text as inputs to BLASER does not increase the correlation with human scores, but best correlations are achieved when using speech, which motivates the goal of our research. Moreover, we show that using ASR for references is detrimental for text-based metrics.
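The idea of scoring a translation directly in a shared embedding space can be illustrated with a minimal sketch. It assumes the source, translation output and reference have already been encoded into fixed-size vectors by some multilingual speech encoder, which is not shown here. The function name `embedding_score` and the choice of summing two cosine similarities are illustrative assumptions, not the formula from the abstract, which reports its best results with a supervised variant trained on human ratings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embedding_score(src, mt, ref):
    """Hypothetical text-free score (an assumption for illustration):
    similarity of the translation to the source plus its similarity
    to the reference, all in the same embedding space."""
    return cosine(src, mt) + cosine(ref, mt)


# Toy 3-dimensional "embeddings"; real encoders produce far larger vectors.
src = [1.0, 0.0, 1.0]
ref = [1.0, 0.1, 0.9]
good = [0.9, 0.0, 1.1]   # close to both source and reference
bad = [-1.0, 1.0, 0.0]   # far from both

print(embedding_score(src, good, ref) > embedding_score(src, bad, ref))
```

Because every input is a vector, no transcription step is needed at any point, which is the property the abstract emphasises.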


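Simulation-based inference (SBI) is a promising Bayesian inference framework that removes the need for an analytic likelihood when estimating posterior distributions. Recent advances using neural density estimators in SBI algorithms have shown that high-fidelity posteriors can be obtained, but at the cost of a large number of simulations. This can make their application very time-consuming when complex physical simulations are involved. In this work, we focus on using the gradients of the simulator to improve the sample efficiency of posterior density estimation. We present a new method for performing neural posterior estimation (NPE) with a differentiable simulator. We demonstrate how gradient information helps constrain the shape of the posterior and improves sample efficiency.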


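We present a solver for mixed integer programs (MIP) developed for the MIP Competition 2022. Given the 10-minute limit on computation time set by the competition rules, our method focuses on finding a feasible solution and then improving it with a branch-and-bound algorithm. Another competition rule allows up to 8 threads to be used. Each thread is given a different primal heuristic, tuned through hyperparameters, to find a feasible solution. In every thread, once a feasible solution has been found, we stop the heuristic and use a branch-and-bound method with embedded local search heuristics to improve the incumbent solution. The three variants of the diving heuristic that we implemented find a feasible solution for 10 instances of the training data set. These are the best performing of the heuristics we implemented. Our branch-and-bound algorithm is effective on a small portion of the training data set, and it finds a feasible solution for one instance that the diving heuristics could not solve. Overall, when run with extensive computational power, our combined methods can solve 11 of the 19 problems in the training data set within the time limit. Our submission to the MIP Competition was awarded the "Outstanding Student Submission" honorable mention.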
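The two-stage pattern described above (first find a feasible solution with a primal heuristic, then improve it with branch and bound) can be sketched on a toy problem. The example below uses the 0/1 knapsack problem and a greedy heuristic purely as stand-ins: it is not the solver from the abstract, does not implement the diving heuristics, and is single-threaded. All function names are hypothetical, and item weights are assumed to be positive.

```python
def density_order(values, weights):
    """Item indices sorted by value per unit weight, best first."""
    return sorted(range(len(values)),
                  key=lambda i: values[i] / weights[i], reverse=True)


def greedy_incumbent(values, weights, capacity):
    """Primal heuristic: take items in density order while they fit."""
    chosen, total, room = [], 0, capacity
    for i in density_order(values, weights):
        if weights[i] <= room:
            chosen.append(i)
            total += values[i]
            room -= weights[i]
    return total, sorted(chosen)


def lp_bound(values, weights, order, pos, value, room):
    """Upper bound from the LP relaxation: fill the remaining room in
    density order, allowing one fractional item at the end."""
    bound = value
    for i in order[pos:]:
        if weights[i] <= room:
            bound += values[i]
            room -= weights[i]
        else:
            bound += values[i] * room / weights[i]
            break
    return bound


def branch_and_bound(values, weights, capacity):
    """Improve the heuristic incumbent by depth-first branch and bound."""
    order = density_order(values, weights)
    best_value, best_items = greedy_incumbent(values, weights, capacity)
    stack = [(0, 0, capacity, [])]  # (position, value, room, chosen items)
    while stack:
        pos, value, room, chosen = stack.pop()
        if value > best_value:  # found a better feasible solution
            best_value, best_items = value, sorted(chosen)
        if pos == len(order):
            continue
        if lp_bound(values, weights, order, pos, value, room) <= best_value:
            continue  # prune: this subtree cannot beat the incumbent
        i = order[pos]
        stack.append((pos + 1, value, room, chosen))  # branch x_i = 0
        if weights[i] <= room:  # branch x_i = 1, only if feasible
            stack.append((pos + 1, value + values[i],
                          room - weights[i], chosen + [i]))
    return best_value, best_items


values, weights, capacity = [60, 100, 120], [10, 20, 30], 50
print(greedy_incumbent(values, weights, capacity))  # (160, [0, 1])
print(branch_and_bound(values, weights, capacity))  # (220, [1, 2])
```

Starting the search from a heuristic incumbent matters because the pruning test compares each node's bound against the best known solution: the better the starting solution, the more of the tree is discarded early, which is relevant under a fixed time limit.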
